Mixed precision neural network quantization method based on Octave convolution
ZHANG Wenye, SHANG Fangxin, GUO Hao
Journal of Computer Applications    2021, 41 (5): 1299-1304.   DOI: 10.11772/j.issn.1001-9081.2020071106
Abstract
Deep neural networks with 32-bit weights require substantial computing resources, making large-scale deep neural networks difficult to deploy in scenarios with limited computing power, such as edge computing. To address this problem, a plug-and-play neural network quantization method was proposed to reduce the computational cost of large-scale neural networks while avoiding a significant drop in model performance. Firstly, the high-frequency and low-frequency components of the input feature map were separated based on Octave convolution. Secondly, convolution kernels with different bit widths were applied to the high- and low-frequency components respectively. Thirdly, the high- and low-frequency convolution results were quantized to the corresponding bit widths using different activation functions. Finally, the feature maps of different precisions were mixed to obtain the output of the layer. Experimental results verify the effectiveness of the proposed method for model compression: when the model was compressed to 1+8 bit(s), the accuracy dropped by less than 3 percentage points on the CIFAR-10/100 datasets; moreover, a ResNet50-based model compressed to 1+4 bit(s) retained accuracy above 70% on the ImageNet dataset.
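The pipeline the abstract describes — split a feature map into high- and low-frequency components, process each at a different bit width, then mix the results — can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes a single-channel feature map, uses 2×2 average pooling as the Octave-style low-frequency split, simple uniform quantization in place of the paper's learned convolutions and activation functions, and hypothetical bit assignments (`hi_bits`, `lo_bits`).

```python
import numpy as np

def avg_pool2(x):
    # 2x2 average pooling: the low-frequency component at half resolution
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    # nearest-neighbour upsampling back to full resolution
    return x.repeat(2, axis=0).repeat(2, axis=1)

def quantize(x, bits):
    # uniform symmetric quantization to the given bit width
    levels = 2 ** bits - 1
    scale = float(np.abs(x).max()) or 1.0
    q = np.round(x / scale * (levels / 2))
    return q * scale / (levels / 2)

def octave_mixed_precision(x, hi_bits=8, lo_bits=4):
    # Step 1: separate frequency components (Octave-style split)
    low = avg_pool2(x)                  # smooth, half-resolution part
    high = x - upsample2(low)           # high-frequency residual
    # Steps 2-3: quantize each component to its own bit width
    # (which component gets which width is an assumption here)
    high_q = quantize(high, hi_bits)
    low_q = quantize(low, lo_bits)
    # Step 4: mix the precisions back into one output feature map
    return high_q + upsample2(low_q)
```

In the paper the two branches are real convolutions with learned kernels at different bit widths; here the quantizer stands in for both so the data flow of the four steps stays visible.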
APP component recognition method based on object detection
ZHANG Wenye
Journal of Computer Applications    DOI: 10.11772/j.issn.1001-9081.2019081420
Accepted: 30 October 2019